Sud-Est Development Region
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > Romania > Sud-Est Development Region > Tulcea County > Tulcea (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Formalized Hopfield Networks and Boltzmann Machines
Cipollina, Matteo, Karatarakis, Michail, Wiedijk, Freek
Neural networks are widely used, yet their analysis and verification remain challenging. In this work, we present a Lean 4 formalization of neural networks, covering both deterministic and stochastic models. We first formalize Hopfield networks, recurrent networks that store patterns as stable states. We prove convergence and the correctness of Hebbian learning, a training rule that updates network parameters to encode patterns, here limited to the case of pairwise-orthogonal patterns. We then consider stochastic networks, where updates are probabilistic and convergence is to a stationary distribution. As a canonical example, we formalize the dynamics of Boltzmann machines and prove their ergodicity, showing convergence to a unique stationary distribution using a new formalization of the Perron-Frobenius theorem.
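The Hebbian-learning claim in the abstract above can be illustrated numerically. The following is a minimal Python sketch, not the authors' Lean 4 development: it assumes ±1-valued neurons, a zero-diagonal weight matrix, zero thresholds, and sequential asynchronous updates, and all function names are hypothetical. Under those assumptions, pairwise-orthogonal patterns stored by the Hebbian rule are fixed points of the update.

```python
# Illustrative sketch only; conventions (zero diagonal, threshold 0) are assumptions.

def hebbian_weights(patterns):
    """Hebbian rule: W[i][j] = sum over patterns of p[i] * p[j], diagonal kept at 0."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def update_neuron(weights, state, i):
    """Return a copy of `state` with neuron i set to the sign of its local field (ties -> +1)."""
    field = sum(weights[i][j] * state[j] for j in range(len(state)))
    new_state = list(state)
    new_state[i] = 1 if field >= 0 else -1
    return new_state


def is_stable(weights, state):
    """A state is stable if updating any single neuron leaves it unchanged."""
    return all(update_neuron(weights, state, i) == list(state)
               for i in range(len(state)))


def run_until_stable(weights, state, max_sweeps=100):
    """Sweep over neurons in index order until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        previous = list(state)
        for i in range(len(state)):
            state = update_neuron(weights, state, i)
        if state == previous:
            break
    return state


# Two orthogonal patterns of length 4 (their dot product is 0).
p1 = [1, 1, 1, 1]
p2 = [1, -1, 1, -1]
w = hebbian_weights([p1, p2])
print(is_stable(w, p1), is_stable(w, p2))
print(run_until_stable(w, [-1, 1, 1, 1]))  # start from p1 with one flipped bit
```

The orthogonality restriction matters here: with correlated patterns the cross terms in the local field can flip a stored bit, which is presumably why the formal result is stated only for the pairwise-orthogonal case.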
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
- Europe > Germany (0.04)
- North America > Canada > Alberta (0.14)
- Europe > Romania > Sud-Est Development Region > Tulcea County > Tulcea (0.04)
- Asia > Middle East > Jordan (0.04)
Towards Formalizing Reinforcement Learning Theory
In this paper, we formalize the almost sure convergence of $Q$-learning and linear temporal difference (TD) learning with Markovian samples using the Lean 4 theorem prover based on the Mathlib library. $Q$-learning and linear TD are among the earliest and most influential reinforcement learning (RL) algorithms. The investigation of their convergence properties was not only a major research topic during the early development of the RL field but is also receiving increasing attention today. This paper formally verifies their almost sure convergence in a unified framework based on the Robbins-Siegmund theorem. The framework developed in this work can be easily extended to convergence rates and other modes of convergence. This work thus takes an important step towards fully formalizing convergent RL results. The code is available at https://github.com/ShangtongZhang/rl-theory-in-lean.
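For readers unfamiliar with the algorithm whose convergence is being formalized, here is a minimal tabular $Q$-learning sketch in Python. It is not taken from the linked repository: the two-state MDP, the uniform behaviour policy, and the per-pair $1/n$ step sizes are all assumptions chosen for illustration, and the function names are hypothetical. The loop draws its samples from a single trajectory, which is the "Markovian samples" setting the abstract refers to.

```python
import random


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_b Q(s', b)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def toy_step(state, action):
    """Hypothetical two-state MDP: action 1 switches state, action 0 stays.

    Reward 1 is paid only for taking action 0 in state 1.
    """
    next_state = 1 - state if action == 1 else state
    reward = 1.0 if (state == 1 and action == 0) else 0.0
    return reward, next_state


def run_q_learning(num_steps=5000, gamma=0.5, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]   # q[state][action]
    visits = [[0, 0], [0, 0]]
    state = 0
    for _ in range(num_steps):
        action = rng.randrange(2)  # uniform behaviour policy (assumption)
        reward, next_state = toy_step(state, action)
        visits[state][action] += 1
        # Step sizes 1/n: their sum diverges while the sum of squares converges.
        alpha = 1.0 / visits[state][action]
        q_update(q, state, action, reward, next_state, alpha, gamma)
        state = next_state         # next sample continues the same trajectory
    return q


print(run_q_learning())
```

For this toy MDP with discount 0.5, solving the Bellman optimality equation by hand gives an optimal value of 2 for staying in state 1, so that is the entry the estimates should drift toward as the number of steps grows.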
- Europe > Romania > Sud-Est Development Region > Tulcea County > Tulcea (0.05)
- North America > United States > Virginia (0.04)
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- Asia > Middle East > Jordan (0.04)
FFT-based Dynamic Subspace Selection for Low-Rank Adaptive Optimization of Large Language Models
Modoranu, Ionut-Vlad, Safaryan, Mher, Schultheis, Erik, Ryabinin, Max, Chumachenko, Artem, Alistarh, Dan
Low-rank optimization has emerged as a promising direction in training large language models (LLMs) to improve running time and reduce the memory usage of adaptive optimizers by constraining learning to a lower-dimensional space. Prior work typically projects gradients of linear layers using approaches based on Singular Value Decomposition (SVD) or QR-decomposition. Applying these techniques individually to each layer in large models is computationally expensive and incurs additional memory costs due to storing the projection matrices. In this work, we propose a computationally efficient and conceptually simple two-step procedure to approximate SVD/QR-based gradient projections into lower-dimensional spaces by using a predefined orthogonal matrix of the Discrete Cosine Transform (DCT). We dynamically select columns from the DCT matrix based on their alignment with the gradient of each layer. The effective projection matrices are obtained via a simple matmul with the DCT matrix in $O(n^3)$ time, followed by a lightweight sorting step to identify the most relevant basis vectors. For large layers, DCT can be computed via Makhoul's $N$-point algorithm based on Fast Fourier Transform (FFT) in $O(n^2 \log(n))$ time. Due to the predefined nature of the orthogonal bases, they are computed once at the start of training. Our numerical experiments on both pre-training and fine-tuning tasks demonstrate the effectiveness of our dual strategy in approximating optimal low-rank projections, obtaining an approach with rank-independent running time that matches the performance of costly SVD/QR-based methods while achieving faster runtime and reduced memory usage by up to $25\%$ across different model sizes. Our code is available at https://github.com/IST-DASLab/ISTA-DASLab-Optimizers.
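The two-step procedure described in the abstract (one matmul with a fixed DCT matrix, then a sort to pick columns) can be sketched in plain Python as follows. This is an illustrative reading, not the code from the linked repository: the orthonormal DCT-II convention, the use of column-wise Euclidean norms as the "alignment" score, and every function name are assumptions. The sketch uses the dense matmul rather than the FFT-based variant.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: an n x n matrix whose columns are the basis vectors."""
    d = [[0.0] * n for _ in range(n)]
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        for i in range(n):
            d[i][k] = scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
    return d


def matmul(a, b):
    """Dense matrix product of nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def select_dct_columns(grad, rank):
    """Keep the `rank` DCT columns that carry the most gradient mass.

    grad is an m x n matrix (list of rows). Returns the chosen column
    indices, the n x rank projection matrix, and the m x rank projected gradient.
    """
    n = len(grad[0])
    basis = dct_matrix(n)          # fixed basis; in training it would be built once
    coeffs = matmul(grad, basis)   # step 1: column k = gradient coordinates along DCT vector k
    # Step 2: score each column by its Euclidean norm (assumed alignment measure) and sort.
    scores = [math.sqrt(sum(row[k] ** 2 for row in coeffs)) for k in range(n)]
    chosen = sorted(range(n), key=lambda k: scores[k], reverse=True)[:rank]
    chosen.sort()
    projection = [[basis[i][k] for k in chosen] for i in range(n)]
    low_rank_grad = [[row[k] for k in chosen] for row in coeffs]  # equals grad @ projection
    return chosen, projection, low_rank_grad


# Usage: a gradient whose rows all lie along DCT vector 2 should select column 2.
basis = dct_matrix(8)
grad = [[c * basis[i][2] for i in range(8)] for c in (1.0, -2.0, 0.5)]
chosen, projection, low_rank_grad = select_dct_columns(grad, rank=1)
print(chosen)
```

Because the basis is fixed, only the selected column indices need to be stored per layer rather than a full projection matrix, which matches the memory argument made in the abstract.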
- Asia > Middle East > Jordan (0.04)
- Europe > Romania > Sud-Est Development Region > Constanța County > Constanța (0.04)
- Europe > Austria (0.04)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
In-Context Learning for Pure Exploration
Russo, Alessio, Welch, Ryan, Pacchiano, Aldo
We study the problem of active sequential hypothesis testing, also known as pure exploration: given a new task, the learner adaptively collects data from the environment to efficiently determine an underlying correct hypothesis. A classical instance of this problem is the task of identifying the best arm in a multi-armed bandit problem (a.k.a. BAI, Best-Arm Identification), where actions index hypotheses. Another important case is generalized search, a problem of determining the correct label through a sequence of strategically selected queries that indirectly reveal information about the label. In this work, we introduce In-Context Pure Exploration (ICPE), which meta-trains Transformers to map observation histories to query actions and a predicted hypothesis, yielding a model that transfers in-context. At inference time, ICPE actively gathers evidence on new tasks and infers the true hypothesis without parameter updates. Across deterministic, stochastic, and structured benchmarks, including BAI and generalized search, ICPE is competitive with adaptive baselines while requiring no explicit modeling of information structure. Our results support Transformers as practical architectures for general sequential testing.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.66)
- Education (0.67)
- Health & Medicine > Therapeutic Area (0.46)
Russia-Ukraine war: List of key events, day 1,314
Can Ukraine restore its pre-war borders? Why are Tomahawk missiles for Ukraine a 'red line' for Russia? Is Russia testing NATO with aerial incursions in Europe? At least 4 killed in major Russian drone, missile attack on Ukraine's Kyiv. Russia's President Vladimir Putin said his forces are prevailing in what he described as a "righteous battle" in Ukraine. "Our fighters and commanders go on the attack, and the entire country, all of Russia, is waging this righteous battle and working hard," he said.
- Asia > Russia (1.00)
- North America > United States (0.35)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.25)
- Government > Regional Government > Europe Government > Russia Government (1.00)
- Government > Regional Government > Asia Government > Russia Government (1.00)
- Government > Military (1.00)
Canonical Representations of Markovian Structural Causal Models: A Framework for Counterfactual Reasoning
Counterfactual reasoning aims at answering contrary-to-fact questions like "Would Alice have recovered had she taken aspirin?" and corresponds to the most fine-grained layer of causation. Critically, while many counterfactual statements cannot be falsified, even by randomized experiments, they underpin fundamental concepts like individual-wise fairness. Therefore, providing models to formalize and implement counterfactual beliefs remains a fundamental scientific problem. In the Markovian setting of Pearl's causal framework, we propose an alternative approach to structural causal models to represent counterfactuals compatible with a given causal graphical model. More precisely, we introduce counterfactual models, also called canonical representations of structural causal models. They enable analysts to choose a counterfactual assumption via random-process probability distributions with preassigned marginals and characterize the counterfactual equivalence class of structural causal models. Using these representations, we present a normalization procedure to disentangle the (arbitrary and unfalsifiable) counterfactual choice from the (typically testable) interventional constraints. In contrast to structural causal models, this allows implementing many counterfactual assumptions while preserving interventional knowledge, and does not require any estimation step at the individual-counterfactual layer: one only has to make a choice. Finally, we illustrate the specific role of counterfactuals in causality and the benefits of our approach on theoretical and numerical examples.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.86)
- Health & Medicine > Consumer Health (0.48)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.34)